Results 1 - 16 of 16
1.
Journal of Neurology, Neurosurgery and Psychiatry ; 93(9):30, 2022.
Article in English | EMBASE | ID: covidwho-2292109

ABSTRACT

Introduction: Over 50% of stroke survivors have cognitive impairment. National guidelines promote early cognitive testing; however, current pen-and-paper-based tests are not always appropriate, typically take place in hospital, and are time-costly for busy clinicians. This project aimed to create an easy-to-use cognitive assessment tool specifically designed for the needs of stroke survivors, using a computerised doctor built on automatic speech recognition and machine learning. Methods: Patients were approached if they met the eligibility criteria of a recent acute stroke/TIA and no pre-existing condition such as dementia or severe aphasia. Participants could speak to the digital doctor on the ward or at home via a web version. Results: Recruitment started on 8 December 2020; we have screened 614 people assessed for suspected acute stroke/TIA at Sheffield Teaching Hospitals. Of those, we have recruited 71 participants (13 with TIA), with a mean NIHSS of 4.5 and a mean MoCA of 24.6. We will present initial results on the factors affecting participant recruitment, and will also compare the mood and anxiety screening scores used in this study to those collected via the SNAPP database. Discussion: Screening was adapted due to the COVID-19 pandemic, and utilising remote consent and participation allowed the project to continue.

2.
International Journal of Stroke ; 18(1 Supplement):61-62, 2023.
Article in English | EMBASE | ID: covidwho-2254349

ABSTRACT

Introduction: Over 50% of stroke survivors have cognitive impairment. National guidelines promote early cognitive testing; however, current pen-and-paper-based tests are not always appropriate, typically take place in hospital, and are time-costly for busy clinicians. This project aimed to create an easy-to-use cognitive assessment tool specifically designed for the needs of stroke survivors. We used a computerised doctor utilising automatic speech recognition and machine learning. Methods: Patients were approached if they met the eligibility criteria of a recent acute stroke/TIA, no pre-existing medical condition such as dementia or severe aphasia, and not being too medically unwell to complete the assessment. Participants completed the computerised doctor, or "CognoSpeak", on the ward using a tablet or at home via a web version (on a home computer or tablet). The assessment included the GAD-7 and PHQ-9. All had a standard cognitive assessment with the Montreal Cognitive Assessment (MoCA). Results: Recruitment started on 8 December 2020 and is ongoing. 951 people were screened and 104 were recruited; 49 have completed baseline CognoSpeak, 8 have withdrawn, and 3 have died. The mean NIHSS was 3.8 and the mean MoCA 23.9; 31 participants were female, and the mean education level was 17 years. Conclusions: Preliminary data will be presented highlighting the feasibility of an automated cognitive and mood assessment that can be completed at home and on the hyper-acute stroke unit. Screening was adapted due to the COVID-19 pandemic, and utilising remote consent and participation allowed the project to continue.

3.
World Conference on Information Systems for Business Management, ISBM 2022 ; 324:519-534, 2023.
Article in English | Scopus | ID: covidwho-2279320

ABSTRACT

The lockdown due to COVID-19 resulted in our relying heavily on technology to communicate and see each other. That, in turn, led to an explosion of applications providing audio and video conferencing solutions, which has generated vast volumes of visual and auditory data. This ocean of data is largely untapped and has vast potential to help us build applications that can one day truly assist people. Auditory data in particular can be used to build applications that support us by communicating with us and understanding what we say, making them comfortable to use and closer to human interaction. This data can also be processed to give us more insight into human behaviour and what makes us truly us. These and many more insights, waiting to be tapped, can be garnered from this goldmine of data, thereby allowing us to bridge the huge gap between man and machine. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

4.
Journal of Pharmaceutical Negative Results ; 14:577-585, 2023.
Article in English | EMBASE | ID: covidwho-2226818

ABSTRACT

Nobody could have imagined that the world would come to a halt in 2020; when COVID-19 first hit, nobody believed it would bring such enormous changes to the world as we knew it. It ushered in work-from-home, social distancing, and changes in how hygiene is maintained, dealing blows to various industries but also creating an opportunity to reach new heights in technology, particularly in hotels. With the requirement for contactless service during the pandemic, the advantages of an AI attendant became significantly more pronounced. The study is descriptive in nature and adopted snowball sampling for data collection. To study the impact of COVID-19 on the usage of artificial intelligence, regression analysis was applied; it found that between AI (chatbots, motion detectors, voice recognition systems) and RS (online reservation portal), RS is relatively more important than AI in explaining guests' intention to stay. The study also found that customer age did not significantly (p = 0.103) affect guests' intention to stay in the hotel. Copyright © 2023 Authors. All rights reserved.
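
A sketch of the kind of regression analysis this abstract reports, using statsmodels; the survey file and column names are hypothetical, introduced only for illustration:

```python
# Illustrative regression of guest intention-to-stay on AI and RS usage,
# in the spirit of the abstract. "survey.csv" and its columns are made up.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("survey.csv")  # hypothetical survey export
X = sm.add_constant(df[["ai_score", "rs_score", "age"]])
model = sm.OLS(df["intention_to_stay"], X).fit()
print(model.summary())  # compare the coefficients and p-values of AI vs RS
```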

5.
Comput Biol Med ; 153: 106517, 2023 02.
Article in English | MEDLINE | ID: covidwho-2165195

ABSTRACT

The growth and aging of the world population have driven a shortage of medical resources in recent years, especially during the COVID-19 pandemic. Fortunately, the rapid development of robotics and artificial intelligence technologies is helping the healthcare field adapt to these challenges. Among them, intelligent speech technology (IST) has served doctors and patients, improving the efficiency of medical practice and alleviating the medical burden. However, problems such as noise interference in complex medical scenarios and pronunciation differences between patients and healthy people hamper the broad application of IST in hospitals. In recent years, technologies such as machine learning have developed rapidly in intelligent speech recognition and are expected to solve these problems. This paper first introduces IST's procedure and system architecture and analyzes its application in medical scenarios. Secondly, we review existing IST applications in smart hospitals in detail, including electronic medical documentation, disease diagnosis and evaluation, and human-medical equipment interaction. In addition, we elaborate on an application case of IST in the early recognition, diagnosis, rehabilitation training, evaluation, and daily care of stroke patients. Finally, we discuss IST's limitations, challenges, and future directions in the medical field. Furthermore, we propose a novel medical voice analysis system architecture that employs active hardware, active software, and human-computer interaction to realize intelligent and evolvable speech recognition. This comprehensive review and the proposed architecture offer directions for future studies on IST and its applications in smart hospitals.


Subject(s)
COVID-19 , Robotics , Humans , Artificial Intelligence , Speech , Pandemics , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19 Testing
6.
24th International Conference on Information and Communications Security, ICICS 2022 ; 13407 LNCS:608-621, 2022.
Article in English | Scopus | ID: covidwho-2013997

ABSTRACT

The COVID-19 pandemic has led to a dramatic increase in the use of face masks. Face masks can affect both the acoustic properties of the signal and the speech patterns, and they have undesirable effects on automatic speech recognition systems as well as on forensic speaker recognition and identification systems. This is because masks introduce both intrinsic and extrinsic variability into the audio signals; moreover, their filtering effect varies depending on the type of mask used. In this paper, we explore the impact of different masks on the performance of an automatic speaker recognition system based on Mel-frequency cepstral coefficients to characterise the voices and on Support Vector Machines to perform the classification task. The results show that masks slightly affect the classification results. The effects vary depending on the type of mask used, but not as expected: results with FFP2 masks are better than those with surgical masks. An increase in speech intensity was found with the FFP2 mask, which is related to the increased vocal effort made to counteract the effects of hearing loss. © 2022, Springer Nature Switzerland AG.
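
For orientation, a minimal sketch of the MFCC-plus-SVM pipeline this abstract describes, using librosa and scikit-learn; the file paths, labels, and hyperparameters are assumptions for illustration, not the authors' setup:

```python
# Minimal MFCC + SVM speaker-classification sketch (illustrative only).
import librosa
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def mfcc_vector(path, n_mfcc=13):
    """Load an utterance and summarize it as its mean MFCC vector."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # one fixed-length vector per utterance

def train_speaker_svm(wav_paths, speaker_labels):
    """Fit an RBF-kernel SVM on per-utterance MFCC summaries."""
    X = np.stack([mfcc_vector(p) for p in wav_paths])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, speaker_labels, test_size=0.2, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)  # held-out accuracy
```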

7.
Assistive Technology Outcomes and Benefits ; 16(Special Issue 2):45-55, 2022.
Article in English | Scopus | ID: covidwho-2010878

ABSTRACT

Opportunities to present to remote audiences require access for people with disabilities. The COVID-19 pandemic, with its imperative of social distancing, presented access challenges. Innovative tools, such as artificial intelligence (AI) as used in Automatic Speech Recognition (ASR), became available in many applications. Some community information and higher education programs considered supplying access through ASR text produced by AI software tools. This article's contribution to the field is a comparative analysis of some ASR software used as speech-to-text accommodations for Deaf and hard of hearing individuals in informational and educational settings. Some nuances of ASR and human Speech-to-Text Services (STTS) practices are included. The concept of use cases for low-stakes and high-stakes settings is introduced. This article also provides a framework for future studies of the efficacy of ASR software used as an accommodation and best practices for using ASR software in informational and educational remote sessions. © ATIA 2022.

8.
Assistive Technology Outcomes & Benefits ; 16(2):45-55, 2022.
Article in English | ProQuest Central | ID: covidwho-2010877

ABSTRACT

Author Note Regarding person-first language: Members of the Deaf community prefer to be referred to by their identity as a Deaf person rather than a person who is deaf. This article uses the phrase "Deaf and hard of hearing individuals." Opportunities to present to remote audiences require access for people with disabilities. The COVID-19 pandemic, with its imperative of social distancing, presented access challenges. Innovative tools, such as artificial intelligence (AI) as used in Automatic Speech Recognition (ASR), became available in many applications. Some community information and higher education programs considered supplying access through ASR text produced by AI software tools. This article's contribution to the field is a comparative analysis of some ASR software used as speech-to-text accommodations for Deaf and hard of hearing individuals in informational and educational settings. Some nuances of ASR and human Speech-to-Text Services (STTS) practices are included. The concept of use cases for low-stakes and high-stakes settings is introduced. This article also provides a framework for future studies of the efficacy of ASR software used as an accommodation and best practices for using ASR software in informational and educational remote sessions.

9.
2022 CHI Conference on Human Factors in Computing Systems, CHI 2022 ; 2022.
Article in English | Scopus | ID: covidwho-1874707

ABSTRACT

Deaf and Hard-of-Hearing (DHH) users face accessibility challenges during in-person and remote meetings. While the emerging use of applications incorporating automatic speech recognition (ASR) is promising, more user-interface and user-experience research is needed. While co-design methods could elucidate designs for such applications, COVID-19 interrupted in-person research. This study describes a novel methodology for conducting online co-design workshops with 18 DHH and hearing participant pairs to investigate ASR-supported mobile and videoconferencing technologies along two design dimensions: correcting errors in ASR output and implementing notification systems for influencing speaker behaviors. Our methodological findings include an analysis of the communication modalities and strategies participants used, their use of an online collaborative whiteboarding tool, and how participants reconciled differences in ideas. Finally, we present guidelines for researchers interested in online DHH co-design methodologies, enabling greater geographic diversity among study participants even beyond the current pandemic. © 2022 ACM.

10.
3rd International Conference on Electrical and Electronic Engineering, ICEEE 2021 ; : 13-16, 2021.
Article in English | Scopus | ID: covidwho-1788708

ABSTRACT

The research study presents an architecture for a Human-Robot Interaction (HRI) based artificial conversational entity integrated with speaker recognition to deliver modern healthcare services. Due to the COVID-19 pandemic, it has become troublesome for health workers and patients to visit hospitals because of the high risk of virus dissemination. To minimize mass congestion, our architecture would be an appropriate, cost-effective solution that automates the reception system by enabling AI-based HRI and providing fast, advanced healthcare services in the context of Bangladesh. The architecture consists of two significant subsections: speaker recognition, and an artificial conversational entity comprising automatic speech recognition in Bengali, an interactive agent, and text-to-speech synthesis. We used MFCC features as the acoustic parameters, a GMM statistical model to adapt to each speaker's voice, and the expectation-maximization algorithm to identify the speaker. The developed speaker recognition module performed well, with 94.38% average accuracy in noisy environments and 96.27% average accuracy in studio-quality environments, and achieved a word error rate (WER) of 42.15% with an RNN-based Deep Speech 2 model for Bangla Automatic Speech Recognition (ASR). The artificial conversational entity performs with an average accuracy of 98.58% in a small-scale real-time environment. © 2021 IEEE.
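
The speaker-identification step described here (per-speaker GMMs over MFCC frames, fit by expectation-maximization) can be sketched with scikit-learn as follows; the data plumbing and component count are assumptions, not the paper's implementation:

```python
# Sketch of GMM-based speaker identification: one GMM per enrolled speaker
# over MFCC frames; the claimed identity is the model with the highest
# average log-likelihood on the test frames.
from sklearn.mixture import GaussianMixture

def enroll(speaker_frames, n_components=16):
    """speaker_frames: dict mapping name -> (n_frames, n_mfcc) MFCC array."""
    models = {}
    for name, frames in speaker_frames.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag")
        gmm.fit(frames)  # the EM algorithm runs inside fit()
        models[name] = gmm
    return models

def identify(models, test_frames):
    """score() returns the average per-frame log-likelihood."""
    return max(models, key=lambda name: models[name].score(test_frames))
```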

11.
9th International Conference on Big Data Analytics, BDA 2021 ; 13167 LNCS:201-208, 2022.
Article in English | Scopus | ID: covidwho-1750588

ABSTRACT

With ever-increasing internet penetration across the world, there has been a huge surge in content on the World Wide Web, and video has proven to be one of the most popular media. The COVID-19 pandemic has further pushed the envelope, forcing learners to turn to e-learning platforms. In the absence of relevant descriptions of these videos, it becomes imperative to generate metadata based on the content of the video. In this paper, an attempt has been made to index videos based on their visual and audio content. The visual content is extracted by applying Optical Character Recognition (OCR) to the stack of frames obtained from a video, while the audio content is generated using Automatic Speech Recognition (ASR). The OCR- and ASR-generated texts are combined to obtain the final description of the respective video. The dataset contains 400 videos spread across 4 genres. To quantify the accuracy of the descriptions, clustering is performed using the video descriptions to discern between the genres of video. © 2022, Springer Nature Switzerland AG.
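
The OCR-plus-ASR indexing idea can be sketched as below. The abstract does not name the tools used, so OpenCV, pytesseract, and Whisper here are stand-ins chosen for the sketch:

```python
# Illustrative OCR + ASR video-indexing pipeline in the spirit of the
# abstract; the specific libraries are assumptions, not the paper's tools.
import cv2
import pytesseract
import whisper  # pip install openai-whisper

def ocr_frames(video_path, every_n=30):
    """OCR every n-th frame and collect the recognized text."""
    cap = cv2.VideoCapture(video_path)
    texts, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            texts.append(pytesseract.image_to_string(rgb))
        i += 1
    cap.release()
    return " ".join(texts)

def describe_video(video_path):
    """Combine OCR text and the ASR transcript into one description."""
    asr_text = whisper.load_model("base").transcribe(video_path)["text"]
    return ocr_frames(video_path) + " " + asr_text
```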

12.
3rd International Conference on Advancements in Computing, ICAC 2021 ; : 341-346, 2021.
Article in English | Scopus | ID: covidwho-1714004

ABSTRACT

Accommodation is one of the basic needs of travelers, tourists, students, and employees. Accommodations range from low-budget lodges to world-class luxury hotels, but finding preferable accommodation is undoubtedly a tedious task. Due to the COVID-19 pandemic, it has become problematic to visit each property to check whether it suits the accommodation seeker in terms of location and environment and whether it matches the user's preferences. Incidents have been reported where thousands of people were victimized by contract breaches in the accommodation and real estate sectors arising from contract alterations. Considering these problems, we propose a system that provides solutions using Natural Language Processing (NLP), Automatic Speech Recognition (ASR), Augmented Reality (AR), blockchain, and K-Nearest Neighbors (KNN). The system provides an efficient approach to viewing the exterior and interior of an accommodation using 360-degree views, recommendations based on user preferences using KNN and cosine similarity, security for digital agreements using blockchain technology, and a map navigation system using ASR. With the aid of these techniques, a mobile application prototype was created with the possibility of future testing and implementation. © 2021 IEEE.
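
The recommendation step (KNN with cosine similarity) might look like the following sketch; how properties and user preferences are encoded as vectors is assumed, as the abstract does not specify:

```python
# Sketch of KNN-with-cosine-similarity recommendations; the feature
# encoding of properties and preferences is a hypothetical input.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def recommend(property_vectors, user_pref_vector, k=5):
    """Return indices of the k properties closest to the user's preferences."""
    knn = NearestNeighbors(n_neighbors=k, metric="cosine")
    knn.fit(property_vectors)  # rows: one feature vector per property
    _, idx = knn.kneighbors(np.asarray(user_pref_vector).reshape(1, -1))
    return idx[0]
```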

13.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 ; : 2862-2873, 2021.
Article in English | Scopus | ID: covidwho-1678733

ABSTRACT

The automated transcription of spoken language, and of meetings in particular, is becoming more widespread as automatic speech recognition systems become more accurate. This trend has accelerated significantly since the outbreak of the COVID-19 pandemic, which led to a major increase in the number of online meetings. However, the transcription of spoken language has not received much attention from the NLP community compared to documents and other forms of written language. In this paper, we study a variation of the summarization problem over transcriptions of spoken language: given a transcribed meeting and an action item (i.e., a commitment or request to perform a task), our goal is to generate a coherent and self-contained rephrasing of the action item. To this end, we compiled a novel dataset of annotated meeting transcripts, including human rephrasings of action items. We use state-of-the-art supervised text generation techniques and establish a strong baseline based on BART and UniLM (two pretrained transformer models). Because natural speech is often broken and incomplete, the task is shown to be harder than an analogous task over email data. In particular, we show that the baseline models can be greatly improved once they are provided with additional information, and we compare two approaches: one incorporates features extracted by coreference resolution, while the other uses additional annotations to train an auxiliary model to detect the relevant context in the text. Based on systematic human evaluation, our best models exhibit near-human-level rephrasing capability on a constrained subset of the problem. © 2021 Association for Computational Linguistics
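
A hedged sketch of the kind of BART-based rephrasing baseline the abstract mentions, using a generic pretrained Hugging Face checkpoint; the authors' actual models were fine-tuned on their annotated meeting dataset, which this sketch does not reproduce:

```python
# Seq2seq rephrasing with a generic pretrained BART checkpoint
# (facebook/bart-large); fine-tuning on action-item data is assumed
# but omitted here.
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

def rephrase(transcript_snippet, max_len=64):
    """Generate a self-contained rephrasing of a transcript snippet."""
    inputs = tok(transcript_snippet, return_tensors="pt", truncation=True)
    out = model.generate(**inputs, max_length=max_len, num_beams=4)
    return tok.decode(out[0], skip_special_tokens=True)
```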

14.
Applied Sciences ; 12(2):804, 2022.
Article in English | ProQuest Central | ID: covidwho-1630164

ABSTRACT

Featured Application: This work has direct application in live automatic captioning of audiovisual material, which is fundamental to accessibility.

This paper describes the automatic speech recognition (ASR) systems built by the MLLP-VRAIN research group of Universitat Politècnica de València for the Albayzín-RTVE 2020 Speech-to-Text Challenge, and includes an extension of the work consisting of building and evaluating equivalent systems under the closed data conditions of the 2018 challenge. The primary system (p-streaming_1500ms_nlt) was a hybrid ASR system using streaming one-pass decoding with a context window of 1.5 s. This system achieved 16.0% WER on the test-2020 set. We also submitted three contrastive systems. Of these, we highlight the system c2-streaming_600ms_t, which, following a similar configuration to the primary system but with a smaller context window of 0.6 s, scored 16.9% WER on the same test set, with a measured empirical latency of 0.81 ± 0.09 s (mean ± stdev). That is, we obtained state-of-the-art latencies for high-quality automatic live captioning with a small relative WER degradation of 6%. As an extension, the equivalent closed-condition systems obtained 23.3% and 23.5% WER, respectively. When evaluated with an unconstrained language model, we obtained 19.9% and 20.4% WER; i.e., not far behind the top-performing systems, with only 5% of the full acoustic data and the extra ability of being streaming-capable. Indeed, all of these streaming systems could be put into production environments for automatic captioning of live media streams.
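
For reference, the WER figures quoted throughout these abstracts are word error rates: the word-level edit distance between hypothesis and reference, divided by the number of reference words. A self-contained computation:

```python
# Word error rate (WER) via word-level Levenshtein edit distance.
def wer(reference: str, hypothesis: str) -> float:
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i ref words into the first j hyp words
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i  # deletions
    for j in range(len(h) + 1):
        dp[0][j] = j  # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(r)][len(h)] / max(len(r), 1)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1/6 ≈ 0.167
```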

15.
Journal of Medical Internet Research ; 23(11), 2021.
Article in English | EMBASE | ID: covidwho-1553930

ABSTRACT

Background: Prior studies have demonstrated the safety risks when patients and consumers use conversational assistants such as Apple's Siri and Amazon's Alexa to obtain medical information. Objective: The aim of this study is to evaluate two approaches to reducing the likelihood that patients or consumers will act on the potentially harmful medical information they receive from conversational assistants. Methods: Participants were given medical problems to pose to conversational assistants that had previously been demonstrated to result in potentially harmful recommendations. Each conversational assistant's response was randomly varied to include either a correct or an incorrect paraphrase of the query, and either did or did not include a disclaimer message telling the participants that they should not act on the advice without first talking to a physician. The participants were then asked what actions they would take based on their interaction, along with the likelihood of taking the action. The reported actions were recorded and analyzed, and the participants were interviewed at the end of each interaction. Results: A total of 32 participants completed the study, each interacting with 4 conversational assistants. The participants were on average aged 42.44 (SD 14.08) years, 53% (17/32) were women, and 66% (21/32) were college educated. Participants who heard a correct paraphrase of their query were significantly more likely to state that they would follow the medical advice provided by the conversational assistant (χ²₁=3.1; P=.04). Participants who heard a disclaimer message were significantly more likely to say that they would contact a physician or health professional before acting on the medical advice received (χ²₁=43.5; P=.001). Conclusions: Designers of conversational systems should consider incorporating both disclaimers and feedback on query understanding in response to user queries for medical advice. Unconstrained natural language input should not be used in systems designed specifically to provide medical advice.
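
The reported statistics are chi-square tests on response counts. A sketch of how such a test is computed with SciPy, using made-up illustrative counts (not the study's data):

```python
# Chi-square test of independence on a hypothetical 2x2 table, of the kind
# reported in the abstract (e.g. chi-square(1) = 43.5, P = .001).
from scipy.stats import chi2_contingency

observed = [[20, 12],   # heard disclaimer:  contacted physician / did not
            [5, 27]]    # no disclaimer:     contacted physician / did not
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.1f}, p = {p:.3f}")
```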

16.
Int J Speech Technol ; 25(3): 641-649, 2022.
Article in English | MEDLINE | ID: covidwho-1372801

ABSTRACT

Researchers and scientists have conducted plenty of research on COVID-19 since its outbreak. Healthcare professionals, laboratory technicians, and front-line workers such as sanitary workers and data collectors have put tremendous effort into curbing the spread of the COVID-19 pandemic. Currently, the reverse transcription polymerase chain reaction (RT-PCR) testing strategy is used to detect the COVID-19 virus, but RT-PCR testing is expensive, time-consuming, and involves contact that violates social distancing rules. Therefore, this research introduces generative adversarial network (GAN) deep learning to quickly detect COVID-19 from speech signals. The proposed system consists of two stages: pre-processing and classification. The work uses the least mean squares (LMS) filter algorithm to remove noise and artifacts from input speech signals. After noise removal, the proposed generative adversarial network classification method analyses mel-frequency cepstral coefficient (MFCC) features and classifies COVID-19 and non-COVID-19 signals. The results show a prominent correlation of MFCCs with various COVID-19 cough and breathing sounds, and robust discrimination between the COVID-19 and non-COVID-19 models. Compared with existing artificial neural network, convolutional neural network, and recurrent neural network approaches, the proposed GAN method obtains the best results: precision, recall, accuracy, and F-measure of 96.54%, 96.15%, 98.56%, and 0.96, respectively.
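
The LMS pre-processing step named in the abstract is a standard adaptive filter. A minimal sketch, assuming a separate noise-reference channel is available (the paper's exact filter configuration is not given, so the step size and tap count are guesses):

```python
# Minimal LMS (least mean squares) adaptive noise-cancellation sketch.
import numpy as np

def lms_denoise(noisy, noise_ref, mu=0.01, taps=32):
    """Subtract the component of `noisy` predictable from `noise_ref`."""
    w = np.zeros(taps)
    out = np.zeros(len(noisy))
    for n in range(taps, len(noisy)):
        x = noise_ref[n - taps:n][::-1]  # most recent reference samples
        y = np.dot(w, x)                 # estimated noise component
        e = noisy[n] - y                 # error signal = cleaned sample
        w += 2 * mu * e * x              # LMS weight update
        out[n] = e
    return out
```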
